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Robotic  Uniform  Coverage 
of  Arbitrary-Shaped  Connected  Regions 

Paul  M.  Maxim  and  William  M.  Spears 


Abstract — In  this  article  we  present  a  novel  algorithm  for 
uniform  coverage  of  a  region.  Surveillance,  cleaning,  and  mine 
detection  are  some  applications  that  would  benefit  from  this  type 
of  algorithm.  Prior  work  has  claimed  that  this  task  is  impossible 
to  solve  for  non-convex  regions  [7].  Our  algorithm  enables 
robots  to  uniformly  cover  arbitrary  path-connected  regions,  such 
that  the  robot  movements  are  not  predictable  and  the  region 
periphery  is  not  neglected.  The  algorithm  assumes  that  robots 
are  independent  and  is  physics-based,  relying  on  an  analogy 
with  mean  free  paths  of  particles.  Validation  of  the  algorithm  is 
rigorously  provided  via  simulation  and  real  robot  experiments. 

Index  Terms — uniform  coverage,  arbitrary-shaped  connected 
regions,  mean  free  path,  Markov  chains,  robot  experiments. 

I.  Introduction 

Robotic  coverage  can  be  defined  as  the  problem  of  moving 
a  sensor  or  actuator  over  all  points  in  a  given  region  [10]. 
Robotic  de-mining  operations,  snow  removal,  lawn  mowing, 
car-body  painting,  reconnaissance,  surveillance,  search  and 
rescue  operations,  and  ship  hull  cleaning  are  some  of  the 
multitude  of  tasks  that  coverage  algorithms  can  be  applied 
to.  A  coverage  algorithm  must  generate  what  is  called  a 
coverage  path ,  which  is  a  sequence  of  motion  commands  for 
a  robot.  These  algorithms  can  be  classified  as  either  complete 
or  heuristic  and  randomized. 

By  definition,  complete  algorithms  guarantee  a  path  that 
completely  covers  a  path-connected  [16],  robot-traversable 
region  [4].  Heuristic  algorithms  on  the  other  hand,  use  simple 
rules  that  may  work  very  well,  but  they  have  no  provable 
guarantees  to  ensure  the  success  of  coverage. 

Some  of  the  existing  complete  algorithms  decompose  the 
region  to  be  covered  into  subregions  that  are  simple  to  cover, 
then  a  travelling  salesman  algorithm  is  applied  to  generate 
a  coverage  path  that  covers  each  subregion  in  turn  [10],  [4]. 
Another  approach  is  to  create  an  artificial  potential  field,  where 
a  numerical  value  is  assigned  to  each  cell.  This  value  is  then 
increased  every  time  there  is  a  visit  to  the  cell.  The  next 
step  is  then  taken  to  the  cell  with  the  lowest  value  among 
the  adjacent  cells.  Real-time  complete  coverage  can  also  be 
achieved  by  making  the  robot  mark  its  path,  and  then  avoid 
areas  previously  marked  [6].  The  algorithms  require  prior 
knowledge  of  the  shape  of  the  environment  and/or  the  ability 
to  change  the  environment,  either  physically  or  virtually. 

Heuristic  algorithms  for  coverage  employ  simple  behaviors, 
such  as  spiraling,  room  crossing,  or  wall  following  [11]. 
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More  complicated  actions,  like  explorations,  can  be  achieved 
with  a  hierarchy  of  cooperating  behaviors.  Although  complete 
coverage  is  not  guaranteed,  there  are  advantages  to  this  class  of 
algorithms.  A  cost/benefit  analysis  for  randomized  search  has 
been  performed  separately,  by  Balch  [2]  and  Gage  [7].  Robots 
using  randomized  search  strategies  can  be  less  expensive  than 
robots  using  methods  that  need  precise  localization  hardware. 
Balch  found  that  it  may  be  effective  to  use  a  randomized 
search  if  the  robots  without  localization  capabilities  can  be 
constructed  at  one  fifth  the  cost  of  the  more  expensive  robots. 

Gage  focuses  on  coverage  for  de-mining  purposes  and  he 
takes  into  consideration  the  probability  of  a  robot’s  sensor  to 
detect  a  mine.  The  probability  of  correctly  detecting  a  mine  is 
directly  proportional  to  the  advantages  of  methodical  search 
algorithms  over  the  randomized  search  algorithms.  When  these 
advantages  eventually  disappear,  the  use  of  less  expensive 
hardware  is  preferable. 

Gage  introduces  a  randomized  search  strategy  that  provides 
uniform  coverage  [7]  and  defines  certain  properties  that  a  path 
generator  algorithm  must  take  in  consideration:  (1)  the  path 
should  not  be  predictable  (so  that  an  observer  cannot  predict 
the  robot’s  future  path  by  knowing  its  prior  path),  (2)  the  path 
must  not  be  determined  by  a  fixed  reaction  to  environmental 
features,  and  (3)  the  periphery  of  the  search  space  should 
not  be  neglected.  The  algorithm  presented  is  based  on  diffuse 
reflection  -  how  light  reflects  off  a  matte  surface  -  which  is 
then  generalized  to  any  convex  search  areas.  However,  in  order 
to  implement  this  algorithm,  “the  searcher  must  be  able  to 
determine  its  position  within  the  search  area,  and  the  distance 
to  the  opposite  boundary  in  all  directions”.  Gage  states  that  it 
is  impossible  to  extend  this  strategy  to  generalized  non-convex 
areas,  unless  more  restrictions  are  added. 

We  have  invented  a  novel,  yet  simple,  algorithm  that  meets 
Gage’s  three  properties,  but  works  in  a  broad  class  of  envi¬ 
ronments  that  can  be  tiled  with  square  cells.  The  environment 
must  be  path-connected  and  robot-traversable,  but  can  be 
convex  or  concave.  The  algorithm  is  randomized  and  complete. 

Section  II  introduces  the  environment  generation  tool,  the 
simulation  tool  we  developed  to  test  our  algorithm,  the  per¬ 
formance  metric,  and  the  test  environments.  Two  versions  of 
our  uniform  coverage  algorithm  are  also  presented  in  this 
section.  In  Section  III  we  present  the  theory  that  supports  our 
algorithm.  Validation  of  our  algorithm  with  actual  robot  ex¬ 
periments  is  presented  in  Section  VI.  Section  VII  summarizes 
this  article. 
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II.  Simulation 
A.  Environment  Generator 

Our  environment  generator  allows  us  to  control  the  shape 
and  the  size  of  the  environment.  To  be  able  to  monitor  how 
uniformly  the  robot  is  covering  the  environment,  we  split 
the  environment  into  equal-size  square  cells  (Figure  1).  For 
example,  a  square  environment  that  is  60  x  60  units  can  be 
split  into  9  equal  cells,  each  cell  being  20  x  20  units  (we 
will  henceforth  adopt  the  convention  that  a  cell  that  is  c  x  c 
units  will  be  referred  to  as  a  cell  of  size  c).  This  way  we  can 
monitor  how  often  each  cell  was  visited  and  measure  how  well 
the  algorithm  is  performing.  To  ensure  that  our  algorithm  is 
scale  invariant,  we  also  test  on  scaled-down  versions  of  the 
same  environment,  as  we  explain  next. 
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Fig.  1.  A  60  x  60  units  environment  tiled  with  nine  cells  of  size  20. 

We  created  a  graphical  environment  generator  tool  that 
allows  a  user  to  specify  the  initial  cell  size  and  its  decrement 
factor.  The  decrement  factor  is  used  to  generate  scaled-down 
versions  of  the  generated  environment.  For  example,  if  a  cell 
size  of  30  units  is  chosen  with  a  decrement  factor  of  5  units,  a 
total  of  5  environments  will  be  generated,  having  cell  sizes  30, 
25,  20,  15,  and  10  units.  Note,  this  does  not  change  the  number 
of  cells,  since  the  whole  environment  is  scaled  the  same  way. 
The  Graphical  User  Interface  (GUI)  presents  the  user  a  square 
window  with  a  grid  whose  vertices  are  equally  placed  at  the 
initial  cell  size  intervals.  The  user  then  clicks  on  the  vertices  of 
the  desired  environment.  When  all  the  vertices  are  selected,  a 
right  click  anywhere  on  the  grid  will  cause  the  tool  to  generate 
all  the  environments  and  then  exit.  A  GUI  snapshot  showing 
an  arbitrary  environment  outline  is  presented  in  Figure  2  (the 
generated  output  is  presented  in  Listing  7  from  Appendix  A). 

The  output  files  are  named  environment_X.ini ,  where  X 
represents  the  cell  size.  Using  the  same  example  as  above,  the 
output  files  are:  environment_30.ini ,  environment_25.ini ,  envi- 
ronment_20.  ini ,  environment _1 5.  ini ,  and  environment _1 0.  ini. 
Each  of  these  files  contains  the  total  number  of  vertices  the 
environment  has,  the  coordinates  of  each  of  these  vertices,  the 
total  number  of  cells  and  the  lower-left  corner  coordinates  of 
each  of  the  cells,  as  well  as  the  length  of  one  cell  side.  The 
60  x  60  environment  with  nine  cells  of  size  20  is  presented 
in  Listing  1. 

B.  Simulation  Tool 

Our  simulation  is  programmed  in  the  C  language  and  we 
use  the  Free  OpenGL  Utility  Toolkit  (Freeglut)  version  2.4.0 
library  [19]  to  add  graphic  display  capabilities.  Freeglut  is  an 
open-source  alternative  to  the  OpenGL  Utility  Toolkit  (GLUT) 
library.  We  run  our  simulation  on  platforms  running  Linux- 
based  operating  systems,  but  it  can  easily  be  ported  to  different 
platforms  and  operating  systems. 


Fig.  2.  Environment  generator  snapshot 


Listing  1.  environment_20.ini  example 


vertex_number  4 

0 

0 

0 

60 

60 

60 

60 

0 

cell. 

_number  9 

0 

0 

0 

20 

0 

40 

20 

0 

20 

20 

20 

40 

40 

0 

40 

20 

40 

40 

cell. 

_size  20 

1)  Configuration  and  control:  The  configuration  and  con¬ 
trol  of  the  simulation  tool  can  be  performed  at  two  levels:  be¬ 
fore  and  during  the  simulation.  Before  the  simulation  is  started, 
a  configuration  file  allows  the  user  to  set  certain  parameters 
to  their  desired  values.  Some  of  these  parameters  control  the 
simulation  environment,  like  the  size  of  the  graphics  window, 
the  simulation  speed,  and  the  simulation  length,  while  other 
parameters  control  the  actual  algorithm  and  the  robot,  or  agent, 
being  simulated.  Examples  are  the  obstacle  detection  sensor 
range  and  the  distance  at  which  the  robot  will  react  to  the 
obstacles. 

The  actual  simulation  window  has  three  distinct  areas. 
The  top  area  displays  important  real-time  data  about  the 
current  simulation,  the  middle  area  is  the  largest  and  it  shows 
the  outline  of  the  environment  together  with  the  robot  that 
navigates  through  the  environment.  The  robot  is  shown  as  a 
blue  disk,  with  a  black  dot  marking  its  current  heading  and  a 
red  line  that  shows  the  position  and  the  current  range  of  the 
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Fig.  3.  Uniform  coverage  simulation  engine  snapshot  featuring  a  complex 
84  cell  environment  (cell  size  of  20) 

obstacle  detection  sensor.  A  legend  at  the  bottom  of  the  screen 
informs  the  user  about  the  commands  that  can  be  performed 
while  the  simulation  is  running,  like  increasing  or  decreasing 
the  simulation  speed  or  the  obstacle  detection  sensor  response 
range.  Figure  3  shows  a  snapshot  of  the  simulation  window 
that  features  a  complex  84  cell  environment.  In  this  figure,  the 
robot  is  shown  inside  the  top  chamber  of  the  environment. 

A  file  with  the  environment  configuration  must  be  provided 
to  the  simulation  tool.  For  our  preliminary  experiments  a 
robot  is  run  for  10  million  time  steps  inside  this  environment, 
recording  to  a  file  (every  10,000  time  steps)  the  number 
of  visits  the  robot  has  made  to  each  of  the  cells  in  the 
environment  (the  final  distribution  of  visits  per  cell  is  what 
we  are  most  interested  in).  Initially,  for  each  environment  we 
varied  the  starting  position  of  the  robot  from  being  in  a  corner, 
a  side,  or  an  interior  cell.  Also  the  obstacle  detection  sensor 
response  range  was  set  to  1.0  unit  to  meet  Gage’s  condition 
that  the  periphery  should  not  be  neglected.  To  avoid  wall 
collisions  the  robot  velocity  was  set  to  0.8  units/step. 

C.  Kullback-Leibler  Divergence 

After  we  run  our  simulation  tool  on  an  environment  we  get 
a  distribution  that  shows  us  how  many  visits  each  cell  in  the 
environment  has.  Rather  than  just  rely  on  visual  inspection, 
a  metric  is  necessary  to  formally  measure  the  difference 
between  this  observed  distribution  and  the  optimum  uniform 
distribution. 

In  probability  theory,  the  Kullback-Leibler  divergence  is 
a  non-commutative  measure  of  the  difference  between  two 
probability  distributions  P  and  Q.  Typically  P  represents  the 
“true”  distribution  of  data,  observations,  or  a  precise  calculated 


theoretical  distribution.  The  measure  Q  typically  represents  a 
theory,  model,  description,  or  approximation  of  P  [12]. 

In  our  case  P  represents  the  observed  distribution  and  Q 
represents  the  optimum  uniform  distribution.  The  minimum  di¬ 
vergence  is  0.0,  meaning  the  observed  distribution  is  precisely 
the  same  as  the  uniform  distribution.  The  K-L  divergence  of 
Q  from  P  is  defined  to  be: 

DKL(P\\Q)  =  J2P(i)log^  (1) 

In  general,  a  K-L  divergence  of  0.01  is  considered  to  be 
an  excellent  match  between  the  two  distributions  [1].  For  our 
results  with  real  robots  our  goal  is  for  a  K-L  divergence  of 
approximately  0.01.  For  simulation  results  our  goal  is  0.001 
since  we  can  run  the  simulation  for  a  far  greater  number  of 
steps. 

D.  Markov  Chain  Analysis 

Markov  chains  are  used  in  mathematics  to  describe  the 
future  states  of  a  stochastic  process  based  on  the  present 
state  [8].  The  future  states  are  independent  of  the  past  state 
and  are  reached  through  a  probabilistic  process.  A  change  of  a 
state  is  called  a  transition  and  the  probability  that  governs  this 
change  is  called  a  transition  probability.  A  probability  matrix 
Q  defines  the  probability  of  transitioning  to  a  state  j  in  the 
next  step,  given  the  system  is  currently  in  state  i  (i.e.,  Q(i,  j)). 

A  Markov  chain  is  ergodic  if  all  states  are  recurrent, 
aperiodic,  and  communicate  with  each  other.  A  state  i  is 
transient  if  there  exists  a  state  j  that  is  reachable  from  i,  but 
state  i  is  not  reachable  from  j.  If  a  state  is  not  transient,  it  is 
recurrent.  Also,  a  state  i  is  periodic  if  all  paths  leading  from  i 
back  to  i  have  a  length  that  is  multiple  of  some  integer  k  >  1. 
If  this  is  not  the  situation,  the  state  is  aperiodic.  All  states 
communicate  with  each  other  if  there  is  a  path  from  any  state 
i  to  any  state  j  (and  vice  versa). 

Every  ergodic  Markov  chain  has  a  steady-state  distribution 
that  describes  the  long-term  behavior  of  the  system.  The 
existence  of  a  steady-state  distribution  indicates  that  the  initial 
state  of  the  system  is  of  no  importance  in  the  long  term. 
Furthermore,  if  Q  is  symmetric  =  Q(j,  i)),  then 

the  steady-state  distribution  is  the  uniform  distribution.  This 
latter  property  is  crucial  to  achieving  a  good  uniform  coverage 
algorithm. 

Here  we  assume  that  a  state  is  simply  the  cell  that  the  robot 
is  in.  The  robot  transitions  from  cell  to  cell  as  it  moves.  Hence, 
if  our  algorithm  for  moving  the  robot  can  be  described  as  an 
ergodic  Markov  chain  with  a  symmetric  probability  transition 
matrix,  the  steady-state  distribution  must  be  uniform  and  the 
algorithm  is  correct.  The  algorithm  is  also  complete  if  the 
environment  is  path-connected  and  robot-traversable. 

Consider  the  nine-cell  environment  shown  in  Figure  1 .  Let 
us  also  consider  an  idealized  situation  where  a  robot  can  reside 
at  the  center  of  cells.  For  example,  suppose  the  robot  is  at  the 
center  of  cell  4.  The  robot  could  then  transition  to  adjacent 
cells  1,  3,  5,  or  7.  The  probability  of  transitioning  directly  to 
cells  0,  2,  6,  or  8  is  essentially  zero  (i.e.,  the  robot  would 
have  to  pass  through  cells  1,  3,  5,  or  7  first  to  get  to  the 
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other  cells).  Hence  we  will  ignore  the  “diagonal”  neighbors. 
An  ideal  algorithm  is  as  follows.  The  robot  turns  uniformly 
randomly  and  then  attempts  to  move.  It  might  stay  in  the  same 
cell  (if  the  cell  is  along  the  border  of  the  environment),  or  it 
might  enter  an  adjacent  cell.  If  it  enters  an  adjacent  cell  we 
assume  it  then  moves  to  the  center  of  that  cell,  so  that  the 
algorithm  can  continue. 

The  Markov  chain  for  this  idealized  algorithm  is  trivial  to 
compute.  If  a  cell  has  b  periphery  edges,  then  there  is  a  6/4 
probability  of  staying  in  that  cell.  The  remaining  probability 
mass  is  uniformly  distributed  to  adjacent  cells. 

The  probability  transition  matrix  Q  for  the  nine-cell  envi¬ 
ronment  is: 
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0.00 

0.25 

0.00 

0.25 

0.50_ 

(2) 


This  matrix  is  ergodic  and  the  steady-state  distribution 
indicates  that  each  cell  is  visited  precisely  l/9th  of  the  time, 
which  is  what  we  desire. 

It  is  important  to  point  out  that  this  is  an  existence  proof 
that  an  algorithm  may  exist  that  provides  uniform  coverage  for 
all  path-connected  environments  that  can  be  tiled  with  square 
cells.  The  algorithm  given  above  is  not  realistic.  However,  the 
argument  above  does  provide  guidance  on  how  to  create  a 
viable  algorithm:  the  resulting  Markov  chain  must  be  ergodic 
and  have  a  symmetric  probability  transition  matrix. 

E.  First  Algorithm 

Our  first  algorithm  requires  only  that  the  robot  be  able  to 
sense  an  object  (wall)  in  front  of  it.  The  robot  starts  by  moving 
forward.  When  it  senses  an  object  directly  in  front  of  it,  the 
robot  stops.  Then  the  robot  makes  a  uniform  [ — 7r,  7t]  random 
turn.  If  there  is  still  an  object  in  front,  the  robot  turns  again. 
The  robot  continues  to  turn  until  it  detects  no  obstacle  in  front 
of  it.  Then  the  robot  moves  forward  again.  This  algorithm  does 
not  avoid  the  periphery,  it  is  not  predictable,  and  it  works  in 
all  path-connected  environments.  Proof  of  this  claim  lies  in 
the  assumption  of  tiling  the  environment  with  square  cells.  The 
resulting  Markov  chain  is  ergodic  and  will  have  a  steady-state 
distribution.  However,  it  is  not  clear  whether  the  steady-state 
distribution  is  uniform. 

F.  Environments 

The  first  test  of  our  algorithm  was  on  a  simple  40  x  40 
units  square  environment,  with  four  cells  of  size  20.  As  seen 
in  Listing  2,  the  distribution  is  very  close  to  uniform.  This 
is  reasonable,  since  all  four  cells  are  corner  cells  and  can  be 
considered  equivalent. 


Listing  2.  Visits  in  the  four  cell  square  environment 

cell  0:  2503625 
cell  1:  2500184 
cell  2:  2494401 
cell  3:  2501790 


The  next  tested  environment  is  the  60  x  60  units  square 
environment  shown  in  Figure  1,  with  nine  cells.  This  provides 
us  with  different  categories  of  cells,  such  as  corners,  sides,  and 
an  inner  cell.  The  results  for  this  environment  are  presented  in 
Listing  3.  We  can  see  that  the  distribution  is  not  uniform.  In 
fact,  the  center  cell  (4)  has  the  least  amount  of  visits.  The  main 
reason  is  the  inner  position  of  this  cell.  The  robot  does  not 
detect  any  obstacles  in  this  cell  and  spends  the  least  amount 
of  time  there  because  it  moves  straight  through  it.  Clearly, 
this  first  algorithm  is  not  adequate,  since  the  probability  of 
transitioning  from  the  center  cell  to  any  other  cell  is  higher 
than  transitioning  from  any  other  cell  to  the  center  cell.  A 
similar  difference  exists  between  the  corner  cells  and  side 
cells. 


Listing  3.  Visits  in  the  nine  cell  square  environment  (1st  algorithm) 


cell  0 

1242516 

cell  1 

1048036 

cell  2 

1229541 

cell  3 

1042531 

cell  4 

888209 

cell  5 

1052790 

cell  6 

1225855 

cell  7 

1041632 

cell  8 

1228890 

Taking  random  turns  only  when  detecting  a  wall  is  not 
sufficient.  This  tells  us  that  in  certain  situations,  Q(i,j)  ^  Q(j,i)- 
An  improvement  has  to  be  brought  to  this  algorithm.  This 
improvement  is  inspired  from  physics,  as  we  describe  next. 

G.  Second  Algorithm 

The  prior  results  on  the  nine  cell  environment  indicate  that 
the  transition  probabilities  are  not  symmetric.  In  fact,  the 
results  indicate  that  the  robot  needs  to  spend  more  time  in 
cells  with  less  peripheral  edges.  This  can  be  achieved  using  the 
concept  of  mean  free  path  from  physics.  In  physics,  a  molecule 
in  a  gas  moves  with  constant  speed  along  a  straight  line 
between  successive  collisions.  The  mean  free  path  is  defined 
as  the  average  distance  between  such  successive  collisions  [9] . 
Depending  on  the  number  of  molecules  in  the  space  available 
to  them,  the  mean  free  path  can  range  from  zero  to  infinite. 

In  the  first  algorithm,  the  only  time  the  robot  changes 
direction  is  when  it  senses  a  wall.  Hence  it  spends  less  time  in 
cells  with  fewer  peripheral  edges.  In  order  to  counteract  this, 
we  want  the  robot  to  act  as  if  it  collides  with  another  virtual 
object  occasionally.  Hence  we  want  the  robot  to  randomly 
change  direction  after  it  has  moved  a  certain  distance,  namely, 
its  mean  free  path.  Since  the  mean  free  path  needs  to  be 
normalized  by  the  cell  size,  we  assume  the  mean  free  path 
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will  be  /  x  c,  where  /  is  some  real  number,  and  c  is  the  cell 
size. 

We  empirically  determined  that  /  ~  0.6  works  best.  If  we 
rerun  the  same  experiment  with  the  nine-cell  environment, 
the  results  are  shown  in  Listing  4.  We  can  easily  see  the 
improvement  of  this  version  of  the  algorithm  over  the  previous 
version.  By  using  the  Kullback-Leibler  (K-L)  divergence 
metric  we  can  measure  how  close  this  distribution  is  to  the 
uniform  distribution.  For  this  experiment  the  K-L  divergence 
is  0.000060.  Remember  that  the  closer  this  metric  is  to  0.0, 
the  better.  To  give  a  better  understanding  of  how  the  K-L 
metric  changes  over  time,  Figure  4  shows  the  graph  of  the 
K-L  divergence  metric  every  10,000  time  steps  for  10  million 
time  steps.  This  graph  is  not  smooth  because  it  represents  only 
one  simulation  run. 


Listing  4.  Visits  in  the  nine-cell  square  environment  (2nd  algorithm) 


f 

cell 

0 

1098245 

cell 

1 

1110451 

cell 

2 

1112045 

cell 

3 

1124196 

cell 

4 

1127813 

cell 

5 

1124090 

cell 

6 

1089300 

cell 

7 

1111144 

cell 

8 

1102716 

Time 


Fig.  4.  K-L  divergence  metric  for  the  nine-cell  square  environment 


It  is  important  to  note  that  because  the  Markov  chain  is 
ergodic  the  time  average  of  our  algorithm  converges  to  the 
ensemble  average  that  would  be  computed  by  running  the 
algorithm  multiple  times  [21].  Hence,  it  isn’t  necessary  to 
perform  multiple  runs  -  the  long  term  behavior  of  the  algo¬ 
rithm  represents  the  ensemble  average  and  more  realistically 
describes  the  behavior  of  a  robot,  which  we  assume  has  a 
very  long-term  presence  in  the  environment,  performing  the 
desired  coverage  task. 


III.  Derivation  of  the  Mean  Free  Path 

The  mean  free  path  can  be  thought  of  as  the  average  distance 
travelled  by  a  robot  to  move  from  one  cell  to  another  adjacent 
cell.  Consider  Figure  5,  with  two  adjacent  cells.  Each  cell  is 
lxl  units  in  size.  Without  loss  of  generality,  assume  robot 
A  is  located  inside  the  left  cell  at  coordinates  (+1,7/1).  We 
would  like  to  compute  the  average  distance  travelled  by  the 
robot  to  barely  enter  the  right  cell  at  location  (1  +  e,  7/2)- 


A 


Fig.  5.  What  is  the  average  distance  travelled  by  A  to  enter  the  right  cell? 

The  simplest  way  to  compute  this  distance  is  via  simulation 
as  shown  in  Listing  5.  Function  u01()  returns  floating  point 
values  uniformly  randomly  from  0  to  1 .  This  simulation  yields 
a  mean  free  path  of  0.6517. 


Listing  5.  Mean  free  path  simulation  -  C  code 


10 


15 


int  main (  int  argc,  char**  argv  )  { 

int  i,  N; 

double  sum,  xl,  x2,  yl,  y2; 

N  =  10000000; 
sum  =  0.0; 

for  (i  =  1;  i  <=  N;  i++)  { 

xl  =  uOl  ()  ; 
x2  =  1.001; 
yl  =  uOl  ()  ; 
y2  =  uOl  ()  ; 

sum  =  sum  +  sqrt ( (x2  -  xl) * (x2  -  xl)  +  \ 
(y2  -  yl)  *  (y2  -  yl)  )  ; 

} 

print f ( "Average  length  =  %f\n",  sum/N) ; 
return  0; 


A  more  formal  derivation  can  be  accomplished  by  comput¬ 
ing  the  average  distance  using  a  definite  integral: 


V((  1  _a;i)2  +  (yi  -  uiW)  dx1dy1dy2  (3) 

The  solution  to  the  indefinite  integral  [18]  is: 


3y24  —  28yiy23  +  42yi2y22  — 


F(x1,y1,y2)  = 

36(1  -  xi)y22yjy22  ~  2 2/12/2  +  (1  -  xi)2  +  yi2  ~  28y\3y2  + 

48(1  -  x\)3y2  In  (~y2  +  yi  +  \J y22  ~  2yiy2  +  (1  -  xi)2  +  yi2  )  + 

72(1  -  x±)yiy2 y22  -  2y1y2  +  (1  -  xi)2  +  yi2  - 

12(2/2  -  yi)4  In  (l  -  X!  +  \J y22  -  2y\y2  +  (1  -  xi)2  +  yi2  )  + 

48(1  -  xi)3yi  In  ( y2  -  yi  +  \J y22  -  2yiy2  +  (1  -  x\)2  +  yi2  )  + 

24(1  -  xi)3^/y22  -  2 i/ii/2  +  (1  -  xi)2  +  2/1 2  - 
36(1  -  xi)yi2\Jy22  -  2yiy2  +  (1  -  xi)2  +  yi2 


288 


(4) 
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Given  this,  we  can  evaluate  the  definite  integral  to  be  [17]: 

F(  1, 1, 1)  -  F(  1,  0, 1)  -  F(0, 1, 1)  +  F( 0, 0, 1)  - 

F(1,1,0)  +F(1,0,0)  +  F(0, 1,0)  -F(0,0,0)  =  .651757  U 

This  conforms  with  the  value  computed  using  Listing  5. 
In  experiments,  we  noted  that  /  =  0.60  works  very  well, 
which  is  slightly  lower  than  theory.  The  difference  is  caused 
by  the  fact  that  the  robot  must  stay  1  unit  away  from  the  walls, 
which  in  turn  makes  the  length  of  the  mean  free  path  slightly 
smaller.  For  example,  we  generally  use  cells  of  size  20.  Thus, 
having  a  wall  response  range  of  1  unit  provides  a  5%  buffer 
zone.  Assuming  a  5%  buffer,  the  theoretical  mean  free  path  is 
lowered  to  0.6079,  which  is  extremely  close  to  our  empirically 
derived  value. 

It  is  important  to  note  that  the  concept  of  mean  free  passage 
has  allowed  us  to  make  the  probability  transitions  symmetric 
(Q(i,  j)  =  Q(j,  i)).  This  can  be  observed  by  noting  that  robot 
A  could  be  in  any  cell  of  any  square-tiled  environment,  and 
the  mean  free  passage  is  the  average  distance  travelled  to  enter 
any  adjacent  cell,  regardless  of  whether  these  cells  are  on  the 
periphery  or  not. 


IV.  Further  Confirmation 

We  ran  our  simulation  over  a  large  variety  of  environments 
and  we  generally  obtain  a  K-L  divergence  number  of  0.001 
or  lower  (the  mean  over  all  environments  is  0.00066  with  a 
standard  deviation  of  0.00068),  which  indicates  that  the  behav¬ 
ior  is  extremely  close  to  the  optimum.  Each  simulation  is  run 
40  million  time  steps.  Table  I  provides  results  over  numerous 
environments  and  cell  sizes.  The  environment  outlines  are 
presented  in  Figures  21,  22,  and  23  from  Appendix  A. 


(a)  -  10,000  time  steps 


(c)  -  1,500,000  time  steps 


(b)  -  100,000  time  steps 


(d)  -  5,000,000  time  steps 


TABLE  I 

K-L  DIVERGENCE  METRIC  FOR  DIFFERENT  ENVIRONMENTS  AND  CELL 

SIZES 


Environment 

Cell  Size  (units) 

10 

15 

20 

25 

30 

3x1 

0.00011 

0.00005 

0.00003 

0.00003 

0.00002 

4cellL 

0.00026 

0.00022 

0.00019 

0.00016 

0.00022 

4cellRoll 

0.00090 

0.00082 

0.00068 

0.00066 

0.00071 

7cell 

0.00078 

0.00075 

0.00064 

0.00056 

0.00061 

3x3 

0.00018 

0.00010 

0.00007 

0.00007 

0.00006 

15  cell 

0.00019 

0.00010 

0.00015 

0.00016 

0.00019 

22cell 

0.00088 

0.00124 

0.00188 

0.00137 

0.00145 

3  5  cell 

0.00138 

0.00079 

0.00067 

0.00067 

0.00106 

6x6 

0.00021 

0.00011 

0.00009 

0.00008 

0.00009 

82cell 

0.00223 

0.00114 

0.00269 

0.00097 

0.00059 

84cell 

0.00210 

0.00309 

0.00067 

0.00156 

0.00064 

10x10 

0.00051 

0.00057 

0.00009 

0.00057 

0.00065 

Figure  6  shows  snapshots  in  time  with  one  simulated  robot 
uniformly  covering  the  rather  complex  84  cell  environment. 
The  grey-scale  denotes  the  frequency  that  a  particular  cell 
is  occupied  (darker  means  less  often).  The  uniformity  of 
the  grey  cells  in  the  last  snapshot  (lower  right)  indicates 
excellent  uniform  coverage.  The  K-L  divergence  graph  for 
this  environment  is  shown  in  Figure  7  and  at  the  last  step  in 
the  simulation  its  value  is  0.00067. 


(e)  -  10,000,000  time  steps  (f)  -  40,000,000  time  steps 

Fig.  6.  One  simulated  Maxelbot  obtaining  uniform  coverage  in  a  complex 
84  cell  environment  -  snapshots  taken  at  the  time  steps  shown  above 


Steps 

Fig.  7.  K-L  divergence  metric  for  a  simulated  84  cell  environment 
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V.  Theory  for  Multiple  Robots 

One  robot  does  very  well  at  uniformly  covering  an  environ¬ 
ment  using  our  algorithm.  We  have  shown  that,  if  a  connected 
environment  is  composed  of  C  square  cells,  in  the  long  term 
there  is  a  ^  probability  of  the  robot  being  in  each  cell. 

Let  us  assume  we  have  a  team  of  N  independent  robots 
performing  the  same  algorithm  in  a  large  environment,  such 
that  their  interactions  are  minimal  (they  merely  have  to  avoid 
each  other  on  rare  occasions).  For  example,  N  G  [10,  50]  is 
very  reasonable  for  systems  that  are  usually  deployed  in  the 
field.  The  probability  of  one  robot  being  in  one  of  the  cells 
is  The  probability  of  one  robot  not  being  in  that  cell  is 
1  —  (^) .  The  probability  of  all  N  robots  not  being  in  that  cell 
is  then  (l  —  (^r))^.  Hence,  the  probability  that  at  least  one 
robot  is  in  that  cell  is: 


Now  suppose  we  require  some  minimum  probability  such 
that  some  robots  will  be  in  any  cell  at  any  time  step.  Let  us 
call  this  probability  Pm.  Then  we  need  to  set  the  number  of 
robots  N  high  enough,  such  that: 


Now  we  can  easily  compute  the  number  of  robots  N  that 
we  need  to  satisfy  the  constraints  imposed  by  the  task. 

VI.  Hardware  Implementation 

Moving  from  the  simulation  world  to  the  real-world  appli¬ 
cation  is  not  necessarily  a  small  step.  Before  we  can  do  this 
move,  we  need  to  carefully  choose  the  real-world  equivalents 
for  the  two  most  important  simulation  measuring  units.  First, 
the  unit  to  measure  distance  in  the  simulation  world  is  1 
pixel,  and  we  translate  that  to  2.54  centimeters  in  the  real 
world.  Second,  time  is  measured  in  “steps”  in  the  simulation, 
and  we  translate  that  to  seconds  in  the  real  world.  Having 
defined  these  basic  units  of  measure,  we  can  easily  translate 
the  important  variables,  like  the  maximum  speed  of  the  robot 
and  the  obstacle  detection  response  range. 

The  hardware  experiments  are  performed  using  our  Max- 
elbots.  The  Maxelbot  is  a  robot  platform  equipped  with 
a  trilateration-based  localization  system,  developed  in  our 
laboratory  [20],  [14],  [15].  The  trilateration  module  includes 
three  acoustic  transducers,  three  parabolic  cones,  and  a  radio 
(RF)  transceiver.  The  parabolic  cones  convert  the  acoustic 
energy  into  the  horizontal  plane.  When  a  Maxelbot  “pings,” 
it  emits  an  RF  pulse  from  the  transceiver  and  an  acoustic 
pulse  from  one  transducer.  When  other  Maxelbots  receive 
the  RF  pulse,  they  start  listening  for  the  acoustic  pulse, 
using  all  three  transducers.  The  time  of  flight  is  measured 
at  each  transducer,  which  is  converted  to  distance  using  the 
speed  of  sound.  These  three  distances  yield  the  range  and 
bearing  of  the  “pinging”  Maxelbot.  Hence,  our  trilateration 
is  a  one-to-many  protocol,  allowing  multiple  Maxelbots  to 
simultaneously  trilaterate  and  determine  the  position  of  the 


transmitting  Maxelbot.  The  effective  sensing  range  is  roughly 
three  meters,  due  to  attenuation  of  the  acoustic  pulse.  The  RF 
can  be  heard  to  at  least  ten  meters. 

In  the  following  experiments  the  trilateration-based  local¬ 
ization  module  is  used  in  a  novel  way.  One  Maxelbot  is  moving 
inside  the  environment  (the  “coverage  robot),  while  stationary 
Maxelbots  (“monitor  robots”),  embedded  in  the  walls  of  the 
environment,  record  its  position  at  all  times. 

The  trilateration  module  on  the  moving  Maxelbot  emits 
one  RF-acoustic  pulse  every  120  milliseconds.  This  translates 
to  8.33  pulses  per  second  or  a  frequency  of  8.33  Hz.  This 
is  important  in  order  to  compute  the  time  passed  from  the 
number  of  trilateration  readings.  For  example,  100  trilateration 
readings  are  equivalent  to  roughly  12  seconds  of  run  time.  To 
detect  obstacles,  the  moving  Maxelbot  is  equipped  with  our 
Obstacle  Detection  Module  [14],  which,  in  this  setup,  uses 
only  two  Infra  Red  sensors.  The  two  IR  sensors  are  placed  at 
the  front  corners  of  the  Maxelbot  and  are  aimed  straight  ahead. 
We  use  two  IR  sensors  instead  of  one  (as  in  the  simulation)  to 
enhance  wall  detection  and  minimize  collisions  with  the  walls. 
The  robot  stops  and  turns  if  either  sensor  detects  an  object 
within  the  response  range.  For  our  experiments  the  IR  sensors 
are  polled  once  per  second.  With  maximum  battery  charge,  the 
Maxelbot  travels  approximately  22.86  centimeters  per  second. 
To  prevent  collisions  with  the  walls,  the  response  range  is 
set  to  34.29  centimeters  (to  cope  with  robot  momentum).  The 
main  body  of  code  is  presented  in  Listing  6  in  Appendix  A. 

The  embedded  stationary  monitor  robots  monitor  the  posi¬ 
tion  of  the  single  moving  coverage  robot  and  send  the  data 
in  real  time  to  a  PC  over  a  serial  wired  connection  (this 
was  eventually  replaced  with  a  wireless  Bluetooth  connection, 
as  described  in  [15]).  This  data  allows  us  to  verify  that  the 
moving  robot  does  indeed  visit  all  the  “cells”  equally  often. 
All  the  environments  presented  next  are  broken  into  virtual 
cells,  all  cells  having  the  same  size,  which  is  0.762  meters  x 
0.762  meters.  Each  of  the  corners  of  these  virtual  cells  have 
their  coordinates  preset.  The  number  of  visits  to  one  of  these 
virtual  cells  translates  to  how  many  trilateration  readings  were 
performed  inside  the  perimeter  of  the  cell.  This  allows  us  not 
only  to  count  how  many  times  the  perimeter  of  any  cell  was 
crossed,  but  also  the  amount  of  time  the  robot  has  spent  inside 
the  cell  perimeter. 

Next,  in  this  section  we  present  three  different  environments 
with  increased  complexity  that  were  chosen  for  these  exper¬ 
iments.  For  each  environment  we  present  the  experimental 
setup  and  how  well  uniformity  is  achieved  by  looking  both  at 
the  Kullback-Leibler  divergence  metric  and  a  graphical  off¬ 
line  rendering  of  the  frequency  of  the  cell  occupancy. 

A.  Square  Environment 

The  first  environment  in  which  we  test  our  uniform  coverage 
algorithm  is  a  small,  1.524  meters  x  1.524  meters,  square  en¬ 
vironment.  As  mentioned  earlier,  the  cell  size  is  0.762  meters, 
which  yields  a  2  x  2  cell  environment.  In  this  environment 
all  cells  are  equivalent.  The  experiment  serves  as  a  control 
study  to  ensure  that  no  unexpected  biases  have  occurred  in 
our  hardware  implementation. 
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Fig.  8.  Square  environment  (1.524  meters  x  1.524  meters) 

Figure  8  shows  a  picture  of  the  experimental  setup  that  we 
built  in  our  Distributed  Robotics  Laboratory  at  the  University 
of  Wyoming.  Each  tile  on  our  laboratory  floor  is  exactly  30.48 
x  30.48  centimeters,  and  this  makes  it  easy  for  the  placement 
of  the  “walls”,  which  are  pine  wood  beams,  4.572  meters 
long,  15.24  centimeters  wide,  and  25.4  centimeters  tall.  For 
this  particular  experiment  we  also  used  a  top  of  a  wood  table, 
which  is  exactly  1.524  meters  long,  for  one  of  the  walls.  In  the 
lower  right  corner  of  the  picture  there  is  one  stationary  monitor 
robot  (inside  the  outline)  that  is  performing  the  localization, 
based  on  the  pulses  emitted  by  the  other  moving  coverage 
robot  (shown  inside  the  square  environment).  The  monitor 
robot  is  connected  to  a  serial  port  on  a  PC  in  order  to  transmit 
real-time  data  on  the  position  it  senses  the  coverage  robot. 

Based  on  simulation  results,  we  estimated  that  20  minutes 
would  be  sufficient  to  yield  a  K-L  divergence  less  than  0.01. 
This  generated  a  little  over  9,700  trilateration  readings.  Table  II 
shows  the  virtual  cell  division  for  this  environment,  which  was 
used  during  the  data  analysis.  A  trace  of  the  path  taken  by  the 
moving  Maxelbot  is  shown  in  Figure  9.  The  snapshots  were 
taken  at  0.2,  1,  3  and  20  minutes  and  show  that  after  only  3 
minutes,  most  of  the  environment  was  already  covered. 

Another  useful  graphical  representation  of  this  data  is  shown 
in  Figure  10.  Each  of  the  four  snapshots  were  taken  at  the  same 
time  the  path  trace  snapshots  in  Figure  9  were  taken,  and  show 
the  frequency  of  occupancy  of  the  four  cells  (brighter  means 
the  cell  was  visited  more  often).  The  exact  number  of  cell 
visits  for  each  cell  are  shown  in  Table  III. 

This  shows  that  our  cell  occupancy  distribution  is  very  close 
to  the  uniform  distribution  and  that  there  do  not  appear  to  be 
any  biases  in  our  implementation.  The  evolution  over  time  of 
this  metric  is  shown  in  Figure  1 1 .  The  results  indicate  that  the 
K-L  divergence  dropped  to  0.01  after  only  5,000  steps  and 
then  dropped  further  to  0.0012. 

B.  Rectangular  Environment 

For  a  more  difficult  experiment  we  used  a  larger  1.524 
meters  x  4.572  meters  environment.  This  configuration  yields 


TABLE  II 

Cell  division  ol  the  1 .524  meters  x  1 .524  meters  environment 


3 

1 

2 

0 

(c)  -  3  minutes  (d)  -  20  minutes 

Fig.  9.  Maxelbot  path  trace  in  the  1.524  meters  x  1.524  meters  environment 
-  snapshots  taken  at  the  above  shown  times 


(a)  -  0.2  minutes  (b)  -  1  minute 


(c)  -  3  minutes  (d)  -  20  minutes 

Fig.  10.  Cell  occupancy  in  the  1.524  meters  x  1.524  meters  environment  - 
snapshots  taken  at  the  times  shown  above 
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TABLE  III 

SPECIFIC  CELL  VISITS  FOR  THE  1 .524  METERS  X  1 .524  METERS 
ENVIRONMENT 


Cell 

number 

Time  -  minutes  (readings  x  100) 

0.2 

(1) 

1 

(5) 

3 

(15) 

-20 

(97) 

0 

0 

58 

270 

2,576 

1 

0 

104 

391 

2,464 

2 

43 

116 

271 

2,253 

3 

57 

222 

568 

2,407 

K-L  div. 

- 

0.110353 

0.050299 

0.001158 

Steps 

Fig.  11.  K-L  divergence  for  the  1.524  meters  x  1.524  meters  environment 

a  2  x  6  cell  environment  (with  0.762  meters  cell  size). 
Not  only  is  the  environment  three  times  larger  but  it  is  also 
composed  of  two  categories  of  cells  -  corner  cells  and  edge 
cells.  Table  IV  shows  how  we  numbered  each  of  the  12  cells. 
Although  it  is  possible  for  one  monitor  robot  to  “hear”  the 
RF-acoustic  pulse  generated  by  the  moving  coverage  robot, 
the  accuracy  of  the  localization  reduces  drastically  above  three 
meters.  The  solution  to  this  problem  is  to  use  two  monitor 
robots,  one  at  each  end  of  the  rectangular  environment.  The 
picture  of  the  experimental  setup  is  shown  in  Figure  12.  The 
two  Maxelbots  used  for  monitoring  are  highlighted  with  the 
circular  outlines  and  the  coverage  robot  is  featured  in  the 
center  of  the  environment.  Both  monitor  robots  monitor  the 
whole  environment. 

TABLE  IV 

Cell  division  of  the  4.572  meters  x  4.572  meters  environment 
(THE  UPPER  Maxelbot  MONITORS  CELLS  0,  1,2,  6,  7,  AND  8  AND  THE 
LOWER  Maxelbot  MONITORS  CELLS  3,  4,  5,  9,  10,  AND  11) 


6 

0 

7 

1 

8 

2 

9 

3 

10 

4 

11 

5 

Fig.  12.  Rectangular  environment  (1.524  meters  x  4.572  meters) 

During  this  experiment  there  are  two  sets  of  trilateration 
readings  that  are  generated  (one  from  each  monitor  robot). 
Since  data  can  be  lost  occasionally,  when  sent  over  the  RF, 
it  is  possible  for  the  two  data  sets  to  lose  synchronization. 
This  would  make  it  very  difficult  to  merge  the  data  from  the 
two  monitor  robots.  In  order  to  help  with  the  synchronization 
between  the  two  data  sets,  we  introduce  an  identification 
number  to  the  RF  signal.  Each  time  the  coverage  robot  is 
broadcasting  the  RF  -  acoustic  pulse  combination,  a  number 
is  also  transmitted  over  the  RF.  This  number  is  incremented 
with  every  pulse.  The  monitor  robots  attach  this  number  to  the 
trilateration  reading  performed.  This  tuple  is  then  transmitted 
in  real  time  to  a  PC  that  is  connected  to  both  monitor  robots 
with  serial  cables. 

Since  the  environment  is  larger,  we  estimated  that  the 
coverage  robot  would  need  to  run  for  approximately  two  hours 
in  order  to  achieve  an  acceptable  level  of  uniformity  (K-L 
divergence  near  0.01).  During  this  time,  60,000  trilateration 
readings  were  performed  by  each  of  the  two  monitoring 
Maxelbots.  Figure  13  shows  a  graphical  representation  of  the 
two  data  sets  before  merging.  We  found  this  to  be  a  good 
test  of  the  range  and  accuracy  of  our  localization  system. 
The  localization  system  on  one  of  the  Maxelbots  performs 
better  than  the  other  one,  especially  at  distances  over  three 
meters  (note  especially  the  impossible  locations  outside  of  the 
environment). 

To  merge  the  data  and  reduce  noise,  we  divided  the  environ¬ 
ment  into  halves.  Then  we  selected  coordinates  from  the  upper 
monitor  robot  that  are  in  the  upper  half  of  the  environment. 
Similarly  we  selected  coordinates  from  the  lower  monitor 
robot  that  are  in  the  lower  half.  We  used  the  time  stamp 
to  resolve  ambiguities.  The  results  are  shown  in  Figure  14. 
Although  we  do  not  filter  the  points  shown  in  this  Figure, 
for  data  analysis  we  discard  all  the  points  that  are  outside  the 
outline  of  the  environment. 

The  frequency  of  the  occupancy  of  the  cells  for  this  envi¬ 
ronment  is  shown  in  Figure  15.  The  brighter  the  cell  the  higher 
the  number  of  visits  to  that  cell.  The  snapshots  are  taken  at 
0.2,  5,  30,  60,  and  120  minutes  and  the  exact  number  of  visits 
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Fig.  13.  Individual  localization  readings  performed  by  the  two  fixed 
Maxelbots 


Fig.  14.  Merged  localization  readings  performed  by  the  two  fixed  Maxelbots 


they  represent  are  shown  in  Table  V.  After  approximately  5 
minutes  into  the  experiment,  all  the  12  cells  in  the  environment 
were  reached,  with  cells  in  the  center  having  a  larger  number 
of  visits  compared  with  the  cells  close  to  the  two  ends.  As  the 
data  shows,  this  discrepancy  is  drastically  reduced  120  minutes 
into  the  experiment. 

The  final  Kullback-Leibler  divergence  metric  was  approxi¬ 
mately  0.015.  The  graph  of  this  metric  for  the  entire  length  of 
this  experiment  is  shown  in  Figure  16.  Although  this  wasn’t 
quite  as  low  as  expected,  it  is  still  very  reasonable.  Note 

TABLE  V 

Specific  cell  visits  and  K-L  divergence  for  the 
1.524  METERS  X  4.572  METERS  ENVIRONMENT 


Cell 

number 

Time  -  minutes  (trilateration  readings  x  100) 

0.2 

(1) 

5 

(25) 

30 

(150) 

60 

(299) 

-120 

(598) 

0 

0 

35 

779 

1616 

5242 

1 

0 

49 

899 

2472 

6102 

2 

0 

24 

1131 

2701 

5665 

3 

0 

256 

1355 

2580 

6612 

4 

5 

402 

1437 

2327 

5908 

5 

0 

265 

1561 

2103 

5168 

6 

0 

41 

660 

1397 

3512 

7 

0 

6 

572 

1944 

4487 

8 

0 

82 

1105 

2800 

4489 

9 

41 

638 

2034 

3951 

5807 

10 

54 

498 

2005 

3248 

5206 

11 

0 

204 

1462 

2761 

3877 

K-L  div. 

- 

0.469953 

0.069357 

0.035933 

0.015255 

Fig.  15.  Cell  occupancy  in  the  4.572  meters  x  1.524  meters  environment  - 
snapshots  taken  at  0.2,  5,  30,  60,  and  120  minutes 

that  there  are  no  obvious  biases  towards  edge  or  corner  cells, 
which  is  good.  It  is  clear  that  the  initial  placement  of  the 
robot  is  still  influencing  the  results,  so  steady  state  has  not 
yet  been  reached.  Note,  however,  that  the  K-L  divergence 
metric  depends  on  the  cell  granularity.  For  example,  if  this 
environment  is  treated  as  being  composed  of  three  cells  of  size 
1.524  meters  x  1.524  meters  (an  upper,  center,  and  lower  cell), 
the  K-L  divergence  drops  considerably  to  0.002.  Decreasing 
the  size  of  cells  means  it  takes  longer  to  achieve  uniformity. 

C.  L-Shaped  Environment 

We  continue  testing  our  uniform  coverage  algorithm  in 
a  more  complex  and  larger  environment,  specifically  an  L- 
shaped  concave  environment.  Not  only  does  this  environment 
contain  three  categories  of  cells  (corner,  edge,  and  an  interior 
cell),  but  achieving  uniform  coverage  on  concave  environ¬ 
ments  is  considered  to  be  difficult,  if  not  impossible  [7].  The 
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Steps 

Fig.  16.  K-L  divergence  metric  for  the  4.572  meters  x  1.524  meters 
environment 


TABLE  VI 

Cell  division  ol  the  L-shaped,  4.572  meters  x  4.572  meters, 

ENVIRONMENT 


19 

18 

17 

16 

15 

14 

13 

12 

11 

10 

9 

8 

6 

7 

4 

5 

2 

3 

0 

1 

two  sides  of  the  L-shape  are  each  4.572  meters  long,  and 
the  width  is  constant  at  1.524  meters.  With  a  cell  size  of 
0.762  meters,  this  environment  yields  a  total  of  20  equal  cells. 
The  cell  division  and  how  the  cells  are  numbered  is  shown  in 
Table  VI. 

The  picture  of  the  experimental  setup  is  shown  in  Figure  18. 
To  monitor  the  location  of  the  coverage  robot  in  this  larger 
environment,  three  monitor  robots  are  used  (highlighted  with 
circles).  As  before,  each  monitor  robot  monitors  a  partic¬ 
ular  portion  of  the  environment.  This  helps  to  ensure  that 
we  are  receiving  accurate  trilateration  readings.  Since  this 
environment  is  larger,  the  coverage  robot  was  run  for  four 
hours,  which  was  the  maximum  life  of  the  robot  battery.  The 
environment  outlined  in  Figure  17  shows  the  three  regions, 
as  well  as  a  graphical  representation  of  the  approximately 
118,000  merged  readings,  that  were  collected  during  the  four 
hour  long  experiment.  The  positions  of  the  three  monitor 
are  shown  by  the  three  circles  located  outside  the  perimeter 
and  are  labeled  ‘A3”,  “U2”,  and  “C5”.  We  use  these  labels 
to  distinguish  between  the  seven  Maxelbots  we  have.  C5  is 
responsible  for  cells  0,  1,2,  3,  4,  and  5.  Maxelbot  U2  monitors 
cells  6,  7,  11,  12,  13,  17,  18,  and  19.  Finally,  we  used  the  data 
collected  by  Maxelbot  A3  for  cells  8,  9,  10,  14,  15,  and  16. 

The  number  of  visits  to  each  of  the  cells  can  be  seen 
both  as  specific  numbers  in  Table  VII,  and  as  a  graphical 
representation  in  Figure  19.  For  this  environment  it  takes  the 
moving  Maxelbot  about  7  minutes  to  visit  all  the  cells. 


Fig.  17.  Merged  trilateration  readings  from  three  Maxelbots  for  the  L-shaped, 
4.572  meters  x  4.572  meters,  environment 


Fig.  18.  L-shaped,  4.572  meters  x  4.572  meters,  environment 

The  final  K-L  divergence  from  the  uniform  distribution 
is  0.0088,  which  is  excellent.  There  are  no  obvious  biases 
towards  any  of  the  particular  categories  of  cells.  As  shown  in 
Figure  20,  the  K-L  metric  drops  suddenly  after  approximately 
30  minutes,  meaning  that  a  good  uniform  coverage  is  already 
achieved.  After  that,  the  K-L  continues  to  drop,  but  at  a  much 
smaller  rate.  One  very  nice  aspect  of  this  experiment  was 
that  as  the  battery  power  dropped,  the  speed  of  the  robot 
also  dropped  considerably.  However,  despite  this  real-world 
condition,  the  K-L  divergence  was  not  adversely  affected. 

VII.  Summary 

In  this  article  we  have  summarized  different  prior  ap¬ 
proaches  to  robotic  coverage.  Each  of  these  approaches  has 
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TABLE  VII 

SPECIFIC  CELL  VISITS  AND  K-L  DIVERGENCE  FOR  THE  L-SHAPED,  4.572 
METERS  X  4.572  METERS,  ENVIRONMENT 


Time  -  minutes  (readings  x  100) 

Cell 

0.2 

5 

25 

-240 

number 

(1) 

(25) 

(125) 

(1180) 

0 

0 

0 

430 

5547 

1 

0 

0 

701 

5313 

2 

0 

0 

286 

6156 

3 

0 

0 

430 

6228 

4 

0 

34 

396 

5721 

5 

0 

56 

342 

7164 

6 

0 

121 

628 

6291 

7 

0 

90 

332 

5140 

8 

40 

674 

1211 

7145 

9 

0 

392 

1366 

6737 

10 

0 

77 

964 

7314 

11 

0 

108 

619 

7361 

12 

0 

163 

643 

6646 

13 

0 

124 

607 

5200 

14 

59 

259 

653 

5898 

15 

0 

117 

862 

6372 

16 

0 

106 

995 

7931 

17 

0 

28 

396 

5090 

18 

0 

132 

597 

5978 

19 

0 

135 

655 

5292 

K-L  divergence 

- 

- 

0.090440 

0.008751 

(c)  -  25  minutes  (d)  -  240  minutes 


Fig.  19.  Cell  occupancy  in  the  L-shaped  4.572  meters  x  4.572  meters 
environment  -  snapshots  taken  at  0.2,  5,  25,  and  240  minutes 


Fig.  20.  K-L  divergence  metric  for  the  L-shaped,  4.572  meters  x  4.572 
meters,  environment 


its  own  advantages  and  disadvantages.  Traditional  complete 
coverage  algorithms  offer  the  guarantee  that  an  environment  is 
completely  covered,  but  they  require  more  expensive  localiza¬ 
tion  hardware.  Some  of  them  also  require  knowing  the  shape 
of  the  environment  to  be  covered  in  advanced.  Also,  prior 
research  has  shown  that  heuristic  and  randomized  strategies 
are  more  effective  solutions  when  expensive  sensor  hardware 
fails  to  perform  with  100%  accuracy. 

One  difficulty  with  any  approach  that  does  not  include 
environmental  knowledge  is  that  it  is  impossible  to  fully  char¬ 
acterize  transient  behavior,  such  as  how  long  it  will  take  for 
a  coverage  robot  to  move  from  one  end  of  an  environment  to 
another.  For  example,  the  “22cell”  environment  in  Appendix  A 
contains  two  chambers  connected  via  a  corridor.  A  robot  will 
generally  spend  a  lot  of  time  within  a  chamber  before  crossing 
to  the  other.  One  nice  aspect  of  our  algorithm  is  that  we  can 
encourage  longer  traversals  simply  by  increasing  the  mean  free 
path  constant  /.  Hence  the  robot  will  transition  between  one 
chamber  and  another  more  often.  There  will  be  a  degradation 
in  the  uniformity  of  a  coverage,  but  this  might  be  worthwhile 
if  one  is  trying  to  spot  an  intruder.  Hence  /  can  be  modified 
to  provide  a  balance  between  uniformity  and  long  traversals. 

Another  advantage  of  our  algorithm  is  that  it  is  analyzable 
if  the  environment  is  known  (either  a  priori,  or  via  “SLAM” 
techniques  [13]).  At  this  point  the  Q  matrix  can  be  generated 
and  transient  behavior  can  be  computed.  This  is  generally  more 
difficult  for  heuristic  techniques. 

In  summary,  our  uniform  coverage  algorithm  overcomes  the 
limitations  of  prior  approaches.  It  is  a  randomized  theory- 
based  algorithm  that  is  complete,  unpredictable,  and  covers 
the  periphery.  The  algorithm  requires  no  knowledge  about  the 
environment.  Estimating  the  cell  size  can  be  based  on  the  robot 
platform  size  (for  cleaning  operations)  or  sensing  capabilities 
(for  surveillance  operations),  rather  than  environmental  con¬ 
siderations.  Our  algorithm  also  extends  naturally  to  multiple 
robots.  The  performance  of  our  algorithm  was  shown  both  in 
simulation  and  real-world  experiments.  The  Kullback-Leibler 
divergence  metric  we  used  confirms  that  our  algorithm  is  very 
close  in  achieving  optimal  uniform  coverage. 
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Appendix  A 

The  main  while  loop  of  the  Maxelbot  is  shown  in  Listing  6. 
Every  second  the  obstacle  avoidance  module  is  polled  and  a 
move  counter  is  incremented.  If  an  object  is  detected  within 
the  sensor  response  range,  a  random  turn  is  made  clockwise 
or  counterclockwise,  and  the  move  counter  is  reset.  If  the 
move  counter  is  equal  to  two,  a  random  turn  is  also  made 
(corresponding  to  the  mean  free  path  length).  Otherwise  the 
robot  continues  to  move  forward. 


Listing  6.  The  main  body  of  the  Maxelbot  source  code  for  the  uniform 
coverage  algorithm _ 
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while  (1) 

{ 

//  1  second  delay 

delay (  150  ) ; 
move_counter++; 

//  get  a  fresh  reading  from  the  OAM 

ask_the_OAM (  0AM_I2C_ADDRESS  ); 

//  wait  for  ALL  the  data  to  be  received 
//  from  the  OAM 

while (  i 2 c_com_in_pr ogress  )  ; 

while  (  get_inches (  sensors [0]  )  <=  SENSOR_RANGE 

get_inches (  sensors [1]  )  <=  SENSOR_RANGE 

move_counter  ==  2  )  //  Mean  free  path 

{ 

//  rand()  returns  values  from  0  to  32767 

degs  =  (  int  )  (  rand ( )  %  180  +  1  )  ; 
if  (  degs  %  2  )  //  Direction  of  turn 

dir  =  1; 
else 

dir  =  0; 

turn  (  dir,  degs  )  ; 
move_counter  =  0; 
ask_the_OAM (  0AM_I2C_ADDRESS  ); 
while  (  i 2 c_com_in_pr ogress  )  ; 


} 

//  Move  forward 

forward (  1 ,  20  ) ; 
forward (  0 ,  2  0  )  ; 


//  Left  motors 
//  Right  motors 


} 


I 


The  output  of  the  environment  generator  presented  in  Fi¬ 
gure  2  is  shown  in  Listing  7.  First,  the  total  number  of  vertices 
is  shown,  followed  by  the  coordinates  of  each  vertex,  as 
selected  by  the  user.  This  environment  is  split  into  47  equal- 
size  square  cells,  each  cell  being  20  x  20  units  (cell  size  is 
shown  on  the  last  line  of  the  output).  The  coordinates  of  the 
lower  left  corner  for  each  of  the  47  cells  of  this  environment 
are  then  presented.  No  decrement  factor  was  used  in  this 
example. 
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7.  Data  generated  by  the  environment  generator  for  the  arbitrary 
ment  presented  in  Figure  2 
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)ur  uniform  coverage  algorithm  was  tested  in  simulation 
numerous  environments.  In  Section  IV  we  present  the 
erimental  results.  The  outlines  of  the  tested  environments 
presented  in  Figures  21,  22,  and  23.  Another  environment 
mple  is  shown  in  Figure  3. 


(a)  -  3x1  (cell  size  of  30) 


(b)  -  4cellL  (cell  size  of  30) 


(c)  -  4cellRoll  (cell  size  of  30) 


Fig.  21.  Examples  of  environments  tested  with  the  uniform  coverage 
algorithm 
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Fig.  22. 
algorithm 


(a)  -  7cell  (cell  size  of  30) 


(a)  -  35cell  (cell  size  of  30) 


(b)  -  36cell  (cell  size  of  30) 


(c)  -  82cell  (cell  size  of  25) 


(d)  -  10x10  (cell  size  of  30) 


Examples  of  environments  tested  with  the  uniform  coverage 


Fig.  23.  Examples  of  environments  tested  with  the  uniform  coverage 
algorithm 


